Abstract

A popular approach in combinatorial optimization is to model problems as integer linear
programs. Ideally, the relaxed linear program would have only integer solutions, which
happens for instance when the constraint matrix is totally unimodular. Still, sometimes it is
possible to build an integer solution with the same cost from the fractional solution. Examples
are two scheduling problems [4, 6] and the single disk prefetching/caching problem [3].
We show that problems such as the three previously mentioned can be separated into two
subproblems: (1) finding an optimal feasible set of slots, and (2) assigning the jobs or pages
to the slots. It is straightforward to show that the latter can be solved greedily. We are able
to solve the former with a totally unimodular linear program, from which we obtain simple
combinatorial algorithms with improved worst case running time.

1 Introduction

In this work, we propose a specific approach that gives simpler solutions to several optimization
problems. We consider three such problems. The first two are scheduling problems: the
Tall-Small Jobs Problem [4] and the Equal Length Jobs Problem [5]. The last one is
Offline prefetching and caching to minimize stall time [3]. In the Tall-Small Jobs Problem,
we have $m$ machines and $n$ unit length jobs, some of which need to execute on all the machines
at the same time. In the Equal Length Jobs Problem, jobs have a given equal length $p > 1$ and
each job executes on a single machine. In both problems jobs have given release times and
deadlines between which they need to execute. The goal is to find a feasible schedule and, for
the Equal Length Jobs Problem, a feasible schedule that minimizes the total completion time of
the jobs. The third optimization problem, Offline prefetching and caching to minimize stall time,
belongs to a different field: we are given a sequence of $n$ page requests and a cache of size $k$.
We can evict a page from the cache and fetch a new page to replace it. This operation cannot be
done in parallel with another fetch and costs $F$ time units. Serving a page request costs 1 time
unit, unless the page is not yet in the cache, in which case a stall time is generated until the
corresponding fetch completes. The goal is to decide when to evict and fetch pages so as to
minimize the total stall time.

Though quite different, those three problems were solved in a similar manner. Unlike previous
works, where the authors transform the solution of a relaxed integer linear program into an
integer solution, we use a new technique which simplifies the linear programs and allows us to
obtain optimal integer solutions directly. Our approach is based on the observation that only the
structure of the solution matters in the objective function: individual jobs and pages do not
appear in it by name. Therefore, we completely split the resolution process into two phases.
First a simplified linear program is used to find an optimal skeleton for the solution, and only
later do we need to worry about assigning jobs or pages to this skeleton. For the scheduling
problems, the skeleton is a sequence of slots, and the assignment maps jobs to slots; for the cache
problem, the skeleton is a sequence of intervals, and the assignment associates to every interval
a page to evict at the beginning and a page to fetch at the end. Our skeletons are such that the
assignment phase just comes down to running a greedy algorithm. Our contribution is that this
strategy, where the assignment is not computed in the linear program, leads to linear programs
with very simple constraint matrices, which not only are totally unimodular, but are (the
transposes of) incidence matrices of directed graphs.

This allows us to reduce our scheduling problems to a shortest path problem and to reduce
the caching problem to a min cost flow problem. Here is roughly our approach (see Figure 1).
The original linear programs have the following structure. The polytopes described by the linear
programs have fractional vertices, but for the required objective all optimal vertices are integral.
Now we project the solution space to a lower dimensional space. This is done by partitioning
the variables, and replacing every set $S$ of variables of the original linear program by a single
new variable that represents the sum over $S$. The nice thing is that the resulting linear program
is now totally unimodular, and corresponds to a shortest path or a min cost flow problem.

As a result, the tall/small job scheduling problem and the prefetch/caching problem can be
solved in worst case time $O(n^3)$, improving over respectively $O(n^{10})$ [4] and $O^*(n^{18})$ [2].
Implementations are available from the authors' home pages.



Figure 1: Intuition of our approach (legend: direction of objective function, fractional vertices, integral vertices)



2 Scheduling equal length jobs 

We will first introduce our method on a basic scheduling problem. We have $n$ jobs, each of the
same length $p$. Every job $j \in [1, n]$ comes with an interval $[r_j, D_j]$ consisting of a release
time and a strict deadline. The goal is to find a schedule on $m$ parallel machines, such that each
job is assigned to an execution slot consisting of a particular machine and a time interval
$[s_j, s_j + p) \subseteq [r_j, D_j]$. In addition, all execution slots assigned to a particular machine
must be disjoint. One possible application could be frequency allocation. A network operator has
a link with $m$ optical fiber strings. Users ask for allocations of a frequency band of fixed size,
inside the large frequency band that the particular user devices can handle. The goal is to find an
assignment which satisfies all users. In addition we want to find the solution (if it exists) that
minimizes the total completion time of the jobs. In the standard Graham notation, this problem is
called $P \mid r_j; p_j = p; D_j \mid \sum C_j$.

Simons [10] gave a complicated greedy-backtrack algorithm running in time $O(n^3 \log\log n)$,
later improved to $O(mn^2)$ [11]. Recently Brucker and Kravchenko [5] gave another algorithm
for it, using a completely different approach. While their algorithm has worse complexity, it is
interesting because of a generalization which makes it possible to solve an open problem, namely
minimizing the weighted total completion time, where jobs are given priority weights.

A generalization of the feasibility problem is to find a maximal set of jobs, which can all be 
scheduled between their release times and deadlines. This problem is still open. Even the more 
general problem, when jobs come with a weight, and the goal is to find a maximal weighted feasible 
job set, is not known to be NP-hard. 
2.1 Previous work



First we observe that without loss of generality we can restrict ourselves to schedules where each
execution slot starts at some release time plus a multiple of $p$, simply by shifting each slot as
far towards the beginning as possible. Let $T = \{r_i + (a-1)p : 1 \le i, a \le n\}$ be this set of
time points. Finally, for a fixed schedule, if we number the execution slots from left to right, we
can always reassign the $j$-th slot to the machine $(j \bmod m) + 1$. This way we do not need to
take care of which machines the slots are assigned to, as long as at most $m$ slots start in every
time interval of size $p$, which ensures that slots do not overlap on a particular machine. The
linear program of [6] has a variable $x_{jt}$ for each job $j$ and time $t \in T$, with the meaning
that $x_{jt} = 1$ if job $j$ is executed in the slot $[t, t+p)$. Then the program is to minimize
$\sum_{j,t} (t+p)\, x_{jt}$ subject to

$$\forall j \in [1,n]: \quad \sum_{t \in T} x_{jt} = 1 \qquad \text{(every job completes)}$$

$$\forall j \in [1,n],\ \forall t \in T \setminus [r_j, D_j - p]: \quad x_{jt} = 0 \qquad \text{(allowed interval)}$$

$$\forall s \in T: \quad \sum_{s \le t < s+p}\ \sum_{j \in [1,n]} x_{jt} \le m \qquad \text{(no overlapping)}$$

It is quite clear that there is an integer solution to this linear program if and only if there is 
a feasible schedule. While this linear program is not totally unimodular, the authors of [6] were 
still able to round the fractional solution into an integer solution of the same cost. 
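The two normalizations used above (candidate start times and round-robin machine assignment) can be sketched in a few lines of Python. This is only an illustration: the function names `candidate_times` and `assign_machines` are ours and do not come from the paper, and slots are numbered from 0 so that the first slot goes to machine 1.

```python
def candidate_times(releases, p):
    """Return the sorted set T = {r_i + (a-1)*p : 1 <= i, a <= n}."""
    n = len(releases)
    return sorted({r + a * p for r in releases for a in range(n)})


def assign_machines(starts, m):
    """Number the slots from left to right (from 0) and give slot j machine (j mod m) + 1.

    starts: list of slot start times; the result lists the machine of each slot,
    in the same order as the input.
    """
    order = sorted(range(len(starts)), key=lambda i: starts[i])
    machine = [0] * len(starts)
    for j, i in enumerate(order):
        machine[i] = (j % m) + 1
    return machine
```

For instance, with releases 0 and 3 and p = 2 the candidate set is {0, 2, 3, 5}, and four slots starting at 0, 0, 2, 2 on two machines alternate between machines 1 and 2.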

2.2 Relaxing the linear program 

The linear program above computes not only the time slots of the schedule, but also the assignment 
of jobs to slots. However once we are given the skeleton of a schedule, meaning a set of time slots, 
it is always possible to assign the jobs greedily in EDD fashion: assign to every slot the job with 
smallest deadline among the available jobs. We release the linear program from the job assignment, 
in order to obtain a simpler linear program which only computes a feasible skeleton. 

We proceed in several steps. First we weaken equation (every job completes) into the inequality
$\sum_{t \in T} x_{jt} \ge 1$. Then combining this new constraint with (allowed interval) leads to

$$\forall j \in [1,n]: \quad \sum_{t \in [r_j, D_j - p]} x_{jt} \ge 1. \tag{1}$$

Now for every pair $s, t \in T$, $s \le t$, we sum (1) over all jobs $j$ that have
$[r_j, D_j - p] \subseteq [s,t]$. Upper-bounding the left hand side we obtain

$$\forall s,t \in T,\ s \le t: \quad \sum_{s' \in [s,t]} \sum_{j} x_{js'} \ge |\{i : [r_i, D_i - p] \subseteq [s,t]\}|. \tag{2}$$

The constraints are clearly necessary, and we will show later that they are also sufficient to get
the optimal solutions. We reduce the number of variables and group $\sum_j x_{jt}$ by setting
$y_t = \sum_{s \le t} \sum_j x_{js}$. Now $y_t$ represents the total number of slots up to time $t$. To
simplify notation we introduce an additional time point $t_0 < \min T$, and set $T' = T \cup \{t_0\}$.
For any time $t > t_0$, we define the functions $\mathrm{round}(t) := \max\{s \in T' : s \le t\}$ and
$\mathrm{prec}(t) := \max\{s \in T' : s < t\}$.

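minimize $\sum_{t \in T} (t+p)\,(y_t - y_{\mathrm{prec}(t)})$
subject to

$$y_{t_0} = 0, \qquad y_{\max T} - y_{t_0} \le n$$

$$\forall t \in T,\ s = \mathrm{prec}(t): \quad y_s - y_t \le 0 \qquad \text{(order)}$$

$$\forall s \in T,\ t = \mathrm{round}(s+p): \quad y_t - y_s \le m \qquad \text{(load)}$$

$$\forall i,j \in [1,n],\ s = \mathrm{prec}(r_i),\ t = \mathrm{round}(D_j - p),\ s < t: \quad y_t - y_s \ge c_{ij} \qquad \text{(inclusion)}$$

where $c_{ij} := |\{k : [r_k, D_k] \subseteq [r_i, D_j]\}|$ is the number of jobs which have to be executed
in the interval $[r_i, D_j]$.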
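The helper functions round and prec, and the way a solution $(y_t)$ encodes a skeleton, can be illustrated as follows. This is a sketch under our own naming: `round_down` stands for the paper's round (renamed to avoid shadowing Python's built-in), and `points` is assumed to be the sorted list of the time points of $T'$, with $t_0$ first.

```python
from bisect import bisect_left, bisect_right


def prec(points, t):
    """max{s in points : s < t}, for a sorted list of time points."""
    i = bisect_left(points, t)
    if i == 0:
        raise ValueError("no time point strictly before t")
    return points[i - 1]


def round_down(points, t):
    """max{s in points : s <= t}, for a sorted list of time points."""
    i = bisect_right(points, t)
    if i == 0:
        raise ValueError("no time point at or before t")
    return points[i - 1]


def slots_per_time(points, y):
    """Number of slots starting at each t in T, i.e. y_t - y_prec(t).

    points: sorted T' (points[0] is t_0); y: dict mapping time point to y_t.
    """
    return {t: y[t] - y[prec(points, t)] for t in points[1:]}
```

With `points = [-1, 0, 2, 3, 5]` and the cumulative counts `{-1: 0, 0: 1, 2: 1, 3: 3, 5: 3}`, the skeleton has one slot starting at time 0 and two slots starting at time 3.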



The first two constraints fix $y_t$ at the first and last time steps. In fact they are not necessary,
but they simplify the proof. The order inequalities ensure that $(y_t)$ is a non-decreasing sequence.
The load inequalities verify that there are never more than $m$ slots overlapping, and the inclusion
inequalities ensure that there is a feasible mapping from jobs to slots, as we show next.
Except for the equality $y_{t_0} = 0$, every constraint of the linear program has exactly two
variables, with respective coefficients $+1$ and $-1$. So the transpose of the constraint matrix is
the incidence matrix of a directed graph, which means the constraint matrix is totally unimodular.
This property is preserved when adding a row with a single $+1$ entry, corresponding to
$y_{t_0} = 0$. Therefore our linear program's constraints are of the form $Ay \le b$ with $A$ totally
unimodular and $b$ integer. This means that if the linear program has a solution then there is an
optimal integer solution.
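Systems in which every constraint has the form $y_v - y_u \le w$ can be solved with the Bellman-Ford shortest path algorithm: each constraint becomes an arc from $u$ to $v$ of length $w$, and the distances from the source give the componentwise largest solution with $y_{\text{source}} = 0$. The sketch below illustrates this standard technique; it is not the authors' implementation, the function name is ours, and it assumes that every variable is reachable from the source.

```python
def solve_difference_constraints(num_vars, constraints, source=0):
    """Solve a system of constraints y[v] - y[u] <= w with Bellman-Ford.

    constraints: list of triples (u, v, w) meaning y[v] - y[u] <= w.
    Returns the componentwise largest solution with y[source] = 0,
    or None if a negative cycle (an infeasible system) is detected.
    Assumption: every variable is reachable from the source.
    """
    inf = float("inf")
    dist = [inf] * num_vars
    dist[source] = 0
    for _ in range(num_vars - 1):
        changed = False
        for u, v, w in constraints:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break
    # One more pass: any further improvement reveals a negative cycle.
    for u, v, w in constraints:
        if dist[u] + w < dist[v]:
            return None
    return dist
```

With integer right hand sides the distances are integers, which mirrors the integrality argument above. A constraint $y_v - y_u \ge c$ is passed as `(v, u, -c)`.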

Let $(y_t)$ be an optimal integer solution to this linear program. It indeed defines the skeleton
of a solution: at each time $t \in T$ there will be $y_t - y_{\mathrm{prec}(t)}$ slots available for
scheduling jobs. The standard greedy job-slot assignment is defined as scheduling at every slot the
job with the smallest deadline among the jobs that are not yet completed and already released.
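The greedy job-slot assignment just described can be sketched with a priority queue keyed by deadline. The function name `edd_assign` and the input conventions are ours: jobs are (release, deadline) pairs and the skeleton is given as a list of slot start times, repeated according to multiplicity.

```python
import heapq


def edd_assign(jobs, slots, p):
    """Earliest-due-date assignment of jobs to the slots of a skeleton.

    jobs: list of (release, deadline); slots: list of start times with
    multiplicity; p: common job length.
    Returns a dict job index -> start time, or None if the greedy rule fails
    (a slot has no released job, a deadline is missed, or jobs remain).
    """
    by_release = sorted(range(len(jobs)), key=lambda j: jobs[j][0])
    available = []  # heap of (deadline, job index) for released, unscheduled jobs
    nxt = 0
    start = {}
    for t in sorted(slots):
        # Release every job whose release time is at most t.
        while nxt < len(by_release) and jobs[by_release[nxt]][0] <= t:
            j = by_release[nxt]
            heapq.heappush(available, (jobs[j][1], j))
            nxt += 1
        if not available:
            return None
        deadline, j = heapq.heappop(available)
        if t + p > deadline:
            return None
        start[j] = t
    if nxt < len(by_release) or available:
        return None
    return start
```

Sorting and the heap operations take $O(n \log n)$ time in total for $n$ jobs and $n$ slots.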

Lemma 1 The greedy assignment produces a valid schedule. 

Proof: According to the second constraint and the inclusion constraint on $[t_0, \max T]$, we have
$y_{\max T} = n$. We define $V$ to be the multiset of time slots such that slot $[t, t+p]$ is
contained $y_t - y_{\mathrm{prec}(t)}$ times. Therefore $|V| = n$. As mentioned in the previous
section, by the load inequality, the slots can be assigned to machines without overlapping. So it
only remains to show that there exist assignments of jobs to slots which respect release times and
deadlines, and then that the greedy assignment is one of them.

Let $U$ be the set of $n$ jobs, and $G(U, V, E)$ a bipartite graph where $E$ contains an edge between
a job $j$ and a slot $[t, t+p]$ if $[t, t+p] \subseteq [r_j, D_j]$. We have to show that this graph has
an injection from $U$ to $V$, and will use Hall's theorem for this, see [5] or [5].

For a set of jobs $S$, we denote by $\partial S$ the neighboring slots, that is the set of all slots
$t$ such that there is a job $j \in S$ with $(j,t) \in E$. We need to show that for every set $S$,
$|S| \le |\partial S|$, which by Hall's theorem characterizes the existence of an injection. Let $S$
be a set of jobs. Suppose $S$ can be partitioned into $S_1 \cup S_2$ such that for any jobs
$i \in S_1$ and $j \in S_2$ the intervals $[r_i, D_i - p]$ and $[r_j, D_j - p]$ are disjoint. Then
clearly $\partial S$ is the disjoint union of $\partial S_1$ and $\partial S_2$. Therefore we can
without loss of generality assume that $\bigcup_{j \in S} [r_j, D_j]$ is a single interval
$[r_i, D_j]$, for $i = \arg\min_{i \in S} r_i$ and $j = \arg\max_{j \in S} D_j$. Then $|S| \le c_{ij}$.
Also the number of slots in the interval $[r_i, D_j)$ is exactly $y_t - y_s$ for
$s = \mathrm{prec}(r_i)$, $t = \mathrm{round}(D_j - p)$. From the inclusion inequality we get the
required inequality and we conclude that there exists a valid assignment. Now since
$|V| = |U| = n$, the injection is in fact a bijection, and there exists at least one perfect
matching from jobs to slots which respects release times and deadlines.

Proving that the jobs in any of these matchings can be permuted to obtain the greedy matching is
quite standard in scheduling: take two jobs $i, j$ with $D_i < D_j$, where $i$ is scheduled at some
time $t$ while $j$ is scheduled at some time $s$ with $r_i \le s < t$. Then it is possible to exchange
the jobs $i, j$ in their execution slots $[s, s+p)$ and $[t, t+p)$. Using a potential function that
decreases at each exchange, it is possible to transform our schedule into a so-called earliest due
date schedule. We conclude that since there exists at least one valid assignment, the greedy
assignment is valid as well. □

This means that an optimal integer solution can be found with a standard linear program
solver. But our linear program in fact describes the dual of a minimum cost flow problem with
uncapacitated arcs and a single supply node, which corresponds to a shortest path problem and
can be solved in time $O(NM)$, where $N$ is the number of variables and $M$ the number of
constraints [1, p.558].

Theorem 1 Our algorithm solves $P \mid r_j; p_j = p; D_j \mid \sum C_j$ in worst case time $O(n^4)$.

Proof: Given the instance $m, p, r_1, \ldots, r_n, D_1, \ldots, D_n$, we construct the set $T$ of
$O(n^2)$ time points. Then we compute for every pair of jobs $i, j$ the number of jobs $c_{ij}$ which
need to be scheduled in $[r_i, D_j]$. A naive algorithm does it in time $O(n^3)$, which would be
enough for us. However it can be done in time $O(n^2)$ using the following recursive formula. We
assume jobs are indexed in order of release times. For convenience we set $c_{n+1,j} = 0$. Then
$c_{ij} = c_{i+1,j} + 1$ if $D_i \le D_j$ and $c_{ij} = c_{i+1,j}$ if $D_i > D_j$.
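The recursion can be checked against the naive count on a small instance. The sketch below uses 0-based indices and our own function names; it additionally assumes that release times are pairwise distinct, so that "indexed in order of release times" determines exactly which jobs are released at or after $r_i$.

```python
def count_naive(jobs):
    """c[i][j] = number of jobs k with [r_k, D_k] inside [r_i, D_j], in O(n^3)."""
    n = len(jobs)
    return [[sum(1 for r, d in jobs if jobs[i][0] <= r and d <= jobs[j][1])
             for j in range(n)]
            for i in range(n)]


def count_recursive(jobs):
    """Same table in O(n^2) with c[i][j] = c[i+1][j] + [D_i <= D_j].

    jobs: list of (release, deadline), sorted by release time.
    Assumption: release times are pairwise distinct.
    """
    n = len(jobs)
    c = [[0] * n for _ in range(n + 1)]  # row n is the sentinel c[n][j] = 0
    for i in range(n - 1, -1, -1):
        for j in range(n):
            c[i][j] = c[i + 1][j] + (1 if jobs[i][1] <= jobs[j][1] else 0)
    return c[:n]
```

Both functions return the same table on inputs satisfying the assumption; with tied release times the recursion would need the ties grouped together.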

This makes it possible to construct the graph $G$ and find in time $O(n^4)$ the optimal solution to
the linear program, if there is one. Finally we do an earliest due date assignment of the jobs to
the slots defined by the solution to the linear program in time $O(n \log n)$ using a priority
queue. □

Note that in this section we do not beat the best known algorithm for
$P \mid r_j; p_j = p; D_j \mid \sum C_j$, which runs in time $O(mn^2)$ [11]. However, it allows us to
introduce our technique that will be used later on.

3 Scheduling tall and small jobs 

Figure 2: Example for 3 machines (a feasible schedule)

In a parallel machine environment, sometimes maintenance tasks are to be done which involve
all machines at the same time. Think of business meetings or inventory. Formally we are given $n$
jobs of unit length $p = 1$; each job $j$ comes with an integer release time and deadline, forming
an interval $[r_j, D_j]$ in which it must be scheduled. We distinguish two kinds of jobs. The first
$n_1$ jobs are small jobs, in the sense that they must be scheduled on one of the $m$ parallel
machines, it does not matter which one. The $n_2 = n - n_1$ remaining jobs are tall jobs, in the
sense that they must be scheduled on all the $m$ machines at the same time.

A time slot is an interval $[t, t+1)$ for an integer boundary $t$. The goal is to find a feasible
schedule, where each tall job is assigned to a different time slot, and each small job is assigned
to a different (machine, time slot) pair for the remaining time slots. In addition the time slot to
which some job $j$ is assigned must be included in $[r_j, D_j]$.

This problem has been solved by Baptiste and Schieber with a linear program using $O(n^2)$
variables and $O(n^2)$ constraints. The linear program is not totally unimodular, however they
manage to show that for the particular objective function it always has an integer solution. We
provide a linear program using only $O(n)$ variables and still $O(n^2)$ constraints, but whose
constraint matrix is the incidence matrix of a directed graph, and which can be solved in time
$O(n^3)$ with a shortest path algorithm.

Baptiste and Schieber showed that we can assume that the time interval ranges only from 1
to $n$, otherwise the problem could easily be divided into two disjoint subproblems.

In a similar way as before, we will denote by $x_t$ the total number of time slots assigned to
tall jobs in $[1, t+1]$. For convenience we set $x_0 = 0$. The number of small jobs that must be
scheduled in $[s,t]$ is $k_{s,t} = |\{j : j \le n_1, [r_j, D_j] \subseteq [s,t]\}|$ and the same for tall
jobs is $\ell_{s,t} = |\{j : j > n_1, [r_j, D_j] \subseteq [s,t]\}|$. Consider the following linear
program, which does not have an objective function.
$$\forall t \in [1,n]: \quad x_{t-1} \le x_t \tag{3}$$

$$\forall t \in [1,n]: \quad x_t - x_{t-1} \le 1 \tag{4}$$

$$\forall s,t \in [1,n],\ s < t: \quad x_{t-1} - x_{s-1} \ge \ell_{s,t} \tag{5}$$

$$\forall s,t \in [1,n],\ s < t: \quad x_{t-1} - x_{s-1} \le t - s - \lceil k_{s,t}/m \rceil. \tag{6}$$



Inequalities (3) make sure that $(x_t)$ is a non-decreasing sequence, (4) that only one tall job
can be scheduled per unit-length interval, (5) that there are enough slots for the tall jobs and (6)
that there are enough remaining slots for the small jobs.

Once again, the transpose of the constraint matrix is the incidence matrix of a directed
graph, and the constant vector $b$ is integer. As previously, the linear program has optimal
integer solutions.

Theorem 2 Fix an instance of the tall/small scheduling problem. There is an integer solution to 
this linear program if and only if there is a feasible schedule. 

Proof: It is quite obvious that fixing $(x_t)$ according to any feasible schedule will satisfy the
constraints.

For the hard direction, let $(x_t)$ be a solution to the linear program, which we know can be taken
integer. Then $x_t - x_{t-1}$, which can be 0 or 1, is the number of slots for tall jobs at time $t$.
We will again use Hall's theorem to show that there is a valid assignment of the $n_2$ tall jobs to
these slots. Inequality (5) for $[s,t] = [1,n]$ forces $x_n \ge n_2$. Now let $G(U, V, E)$ be the
bipartite graph where $U$ are the $n_2$ tall jobs and $V$ the $x_n$ slots. There is an edge between
job $j$ and time slot $[t, t+1]$ if it is included in $[r_j, D_j]$. We have to show that for every
subset $S \subseteq U$, the number of neighboring slots in $V$ is at least $|S|$. Let $s$ be the
smallest release time among $S$ and $t$ be the largest deadline among $S$. Again it is sufficient to
show this claim for connected sets $S$, in the sense that $\bigcup_{j \in S} [r_j, D_j] = [s,t]$.
Now $|S| \le \ell_{s,t} \le x_{t-1} - x_{s-1}$, where the last expression is the number of slots in
$[s,t]$. This completes the claim that there is a valid assignment from tall jobs to the slots.

For the small jobs, note that $a_{s,t} := (t - s) - (x_{t-1} - x_{s-1})$ is the number of remaining
slots in $[s,t]$ which are not assigned to tall jobs, and $a_{s,t} \cdot m$ small jobs can fit in
that interval. Again inequality (6) implies $k_{s,t} \le m \cdot a_{s,t}$ and Hall's theorem shows
that there is a valid assignment of small jobs to the remaining slots. □

In the original paper [4] the authors gave a linear program which is solved in expected time
O(n ) and worst case time $O(n^{10})$. Using the transformation into a shortest path problem allows
us to improve this complexity.

Corollary 1 The tall/small scheduling problem can be solved in worst case time $O(n^3)$.

Proof: As in the second section, we have a linear program with $O(n)$ variables and $O(n^2)$
constraints which can be produced in time $O(n^2)$. We just take an arbitrary objective function in
which all the variable coefficients are positive, and build the associated graph as in the previous
section. Then we compute all shortest paths from the source $x_0$, in time $O(n^3)$. If this
computation detects a negative cycle, then the problem has no solution. Otherwise, we get the
skeleton of a solution to the problem that minimizes the total completion time of the tall jobs.
Finally if there is a solution, the standard earliest due date assignment, first of tall jobs, then
of small ones, produces a valid schedule in time $O(n \log n)$. □
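To make the construction concrete, here is a minimal sketch that builds constraints (3) to (6) as arcs and runs Bellman-Ford from $x_0$. It only illustrates the reduction and makes no claim about which objective the returned skeleton optimizes. The function name is ours, the counts $k_{s,t}$ and $\ell_{s,t}$ are computed naively, and we assume the normalization above (all release times and deadlines lie in $[1, n]$).

```python
def tall_small_skeleton(small, tall, m):
    """Return a feasible x_0..x_n for constraints (3)-(6), or None if infeasible.

    small, tall: lists of (release, deadline) with integer values in [1, n],
    where n = len(small) + len(tall); m: number of machines.
    An arc (u, v, w) encodes x_v - x_u <= w.
    """
    n = len(small) + len(tall)

    def inside(jobs, s, t):
        return sum(1 for r, d in jobs if s <= r and d <= t)

    arcs = []
    for t in range(1, n + 1):
        arcs.append((t, t - 1, 0))  # (3): x_{t-1} <= x_t
        arcs.append((t - 1, t, 1))  # (4): x_t - x_{t-1} <= 1
    for s in range(1, n + 1):
        for t in range(s + 1, n + 1):
            k = inside(small, s, t)
            ell = inside(tall, s, t)
            arcs.append((t - 1, s - 1, -ell))                    # (5)
            arcs.append((s - 1, t - 1, (t - s) - (-(-k // m))))  # (6), ceil(k/m)

    dist = [0] + [float("inf")] * n
    for _ in range(n):  # n + 1 variables, so n relaxation rounds suffice
        for u, v, w in arcs:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in arcs:
        if dist[u] + w < dist[v]:
            return None  # negative cycle: no feasible schedule
    return dist
```

With two machines, small jobs `[(1, 2), (1, 2)]` and one tall job `[(1, 3)]`, the result starts with `x_0 = 0, x_1 = 0, x_2 = 1`: slot $[1,2)$ is left to the small jobs and slot $[2,3)$ is reserved for the tall job.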

Here again, a direction that we are still exploring is to find another shortest path algorithm,
better fitted to these specific graphs, that could improve this complexity.

4 Prefetching 

Caches are used to improve memory access times. In this context the memory unit is called
a page, and is stored on a slow disk. The cache can store up to $k$ pages. If a page request
arrives and the page is already in the cache, it can be served immediately; otherwise it must first
be fetched from the disk, which introduces a stall time of $F$ units. In the latter case the new
page replaces some other page currently in the cache. The idea of prefetching is to fetch a page
even before it is requested, so as to reduce the stall time: during a fetch which evicts some page
$y$ and replaces it by some page $z$, other requests can be served for pages currently in the cache
and different from $y$ or $z$. In the single disk model we consider here, only a single fetch can
occur at a time. The goal is, knowing in advance the complete request sequence, to come up with
a prefetch schedule which minimizes total stall time.



Figure 3: An optimal prefetching for a cache of size $k = 3$ and a fetch duration $F = 4$ (rows: requests, fetch intervals, evicted and entering pages, stall time, cache content).



While the real life problem is on-line, and has been extensively studied by Cao et al. [7], the
offline problem was first solved in 1998 [3] by the use of a linear program, for which it was
shown that it always has an optimal integer solution, while not being totally unimodular. Later,
in 2000 [2], a polynomial time algorithm was given, modeling the problem as a multicommodity
flow with some postprocessing. Formally the problem can be defined as follows.

The Offline Prefetching problem. The input is a page request sequence $x_1, \ldots, x_n$, an initial
cache set $C_1$, and a fetch duration $F$. Let $k = |C_1|$ be the cache size. A fetch is a tuple
$(s, y, e, z)$, where $y, z$ are pages and $s, e \in [1,n]$ are time points with
$s \le e \le s + F$. The meaning is that at time $s$ the page $y$ leaves the cache and at time $e$
the page $z$ enters the cache. Its cost, the induced stall time, is $F - (e - s)$. The goal is to
come up with a fetch sequence minimizing the total stall time, such that two fetches intersect in at
most one time point, and such that every request can be served, i.e.
$\forall t \in [1,n] : x_t \in C_t$, where $C_t$ is the cache at time $t$ obtained from $C_{t-1}$
by evicting/fetching all the pages that had to be evicted/fetched at time $t$. To simplify notation
we assume that the request sequence contains at least $k$ distinct pages, that $C_1$ consists of the
first $k$ distinct requests, and that at time 1 no page has left/entered the cache yet.

Albers, Garg and Leonardi defined a linear program with a characteristic variable for every
fetch interval $[s,e]$, and two additional characteristic variables for every pair $(y, [s,e])$
indicating whether page $y$ leaves (resp. enters) the cache at the beginning (resp. the end) of the
fetch $[s,e]$. Finally they show that the linear program always has an integer solution for the
considered objective function.

As observed in [3], without loss of generality the page to be evicted at time $t$ from the cache
$C_{t-1}$ is the page whose next request is furthest in the future or which is never requested again.
Also without loss of generality the page to be fetched at time $t$ is the page whose next request
starting from $t$ is nearest in the future. Therefore all the information about the fetches is in
the time intervals, and we will write a linear program which produces only the time intervals in
which evictions/fetches occur. The actual pages have to be assigned in a postprocessing step, in
the greedy manner just mentioned. Rather than having a variable for every interval and every page,
we only count how many pages entered and how many left the cache in total since the beginning.
This leaves us with $O(n)$ instead of $O(n^2 F)$ variables. We denote by $I_t$ (resp. $O_t$) the
total number of pages which entered (resp. left) the cache up to time $t$ included. We get the
following linear program.

minimize $F O_n - F I_1 - \sum_{t=1}^{n} (O_t - I_t)$,
subject to

$$\forall t \in [2,n]: \quad O_{t-1} \le O_t \ \text{ and } \ I_{t-1} \le I_t \tag{7}$$

$$\forall t \in [1,n]: \quad O_t \ge I_t \tag{8}$$

$$\forall t \in [1,n]: \quad O_t \le I_t + 1 \tag{9}$$

$$\forall t \in [1,n]: \quad I_{\min\{t+F,\,n\}} \ge O_t \tag{10}$$

$$\forall\, 1 \le s \le t \le n: \quad I_t - O_s \ge |\{x_s, x_{s+1}, \ldots, x_t\}| - k \tag{11}$$

Inequalities (7) make sure that $(O_t)$ and $(I_t)$ are non-decreasing sequences, (8) that the cache
cannot overflow, (9) that two fetches do not overlap in time, (10) that a fetch length is at most
$F$ and (11) that there are enough fetches to serve all requests. Note that
$|\{x_s, x_{s+1}, \ldots, x_t\}|$ denotes the number of distinct page requests in $[s,t]$, so (11) is
the part of the linear program that depends on the actual problem instance.
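As a sanity check of how constraints (7) to (11) read, the following sketch tests whether given counters $(I_t, O_t)$ satisfy them. The function name and the input layout (lists indexed from 1, with index 0 unused) are our own conventions, not the paper's.

```python
def satisfies_prefetch_constraints(I, O, requests, k, F):
    """Check constraints (7)-(11) for counters I[1..n], O[1..n] (index 0 unused).

    requests: the request sequence x_1..x_n as a Python sequence (0-based),
    k: cache size, F: fetch duration.
    """
    n = len(requests)
    for t in range(2, n + 1):
        if O[t - 1] > O[t] or I[t - 1] > I[t]:   # (7) monotone counters
            return False
    for t in range(1, n + 1):
        if O[t] < I[t]:                          # (8) cache cannot overflow
            return False
        if O[t] > I[t] + 1:                      # (9) at most one open fetch
            return False
        if I[min(t + F, n)] < O[t]:              # (10) fetch length at most F
            return False
    for s in range(1, n + 1):
        seen = set()
        for t in range(s, n + 1):
            seen.add(requests[t - 1])
            if I[t] - O[s] < len(seen) - k:      # (11) enough fetches
                return False
    return True
```

For the request sequence `abaca` with $k = 2$ and $F = 2$, evicting at time 3 and letting the new page enter at time 4 corresponds to `O = [None, 0, 0, 1, 1, 1]` and `I = [None, 0, 0, 0, 1, 1]`, which passes the check; performing no fetch at all violates (11).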

This linear program always has an optimal integer solution, since its constraint matrix is totally
unimodular (for the same reason as in the previous sections: it is the transposed incidence matrix
of a directed graph).

Theorem 3 Let $(I_t, O_t)$ be an optimal integer solution to the linear program. Then there is a
valid fetch sequence of the same cost, which can be built by greedy assignment.

Proof: First we observe that the cost function makes sure that $O_n = I_n$, which ensures that
all intervals are eventually closed. The solution defines $m = O_n$ intervals as follows. For every
$j = 1, \ldots, m$, let $s_j$ be the smallest time such that $O_{s_j} \ge j$ and $e_j$ the smallest
time such that $I_{e_j} \ge j$. Then by (8) and (10) we have $s_j \le e_j \le s_j + F$, which means
that all intervals $[s_j, e_j)$ are well defined and of length at most $F$. Now by (9),
$e_j \le s_{j+1}$ for all $j < m$ (otherwise, at time $t = s_{j+1} < e_j$ we would have
$O_t \ge j + 1$ and $I_t \le j - 1$ by definition, contradicting $O_t \le I_t + 1$), so the intervals
do not overlap (but the ending point of one might be the starting point of another). Moreover the
objective value of $(I_t, O_t)$ equals the total stall time of these intervals, since at each time
$t$ the difference $O_t - I_t$ is 1 if an interval is currently open and 0 otherwise.
It remains to prove that the greedy assignment of pages to evict/fetch to each interval is such
that all requests are served, i.e. it remains to show that the constraints (11) are sufficient. We
denote by $C_s$ the cache obtained at time $s$, after all entrances and evictions that occur at time
$s$. We will show that the following invariant holds in a solution of our linear program for every
time $s \in [1,n]$,

$$\forall t \in [s,n]: \quad I_t - I_s \ge |\{x_s, \ldots, x_t\} \setminus C_s|. \tag{12}$$

The invariant implies that if the number of pages requested in $[s,t]$ but not in the cache at time
$s$ is $a$, then at least $a$ pages must enter the cache somewhere in $[s+1, t]$. In particular it
means, for $t = s$, that the page requested at time $s$ will be in the cache at that moment. The
proof of (12) is by induction on $s$.

Basis case $s = 1$. Let $t_0$ be the smallest request time such that $x_{t_0}$ is not in $C_1$.
Then by the assumption that initially the cache contains the first $k$ distinct requests, we have
for $t < t_0$ that $\{x_1, \ldots, x_t\} \subseteq C_1$. So the right hand side of (12) is 0 and (12)
holds by (7). For $t \ge t_0$, since the intersection of $\{x_1, \ldots, x_t\}$ and $C_1$ has size
exactly $k$, the invariant holds by (11) and $O_1 = I_1 = 0$.

Induction case. Assume the invariant holds for some $s$. Let us show that it also holds for
$s + 1$. Several things can happen at time $s + 1$. Pages can leave the cache and pages can enter
the cache. We will do these operations step by step, transforming $I_s$ into $I_{s+1}$ and $C_s$
into $C_{s+1}$, and show that each step preserves the invariant (12).

By induction hypothesis $x_s \in C_s$, so
$\{x_s, \ldots, x_t\} \setminus C_s = \{x_{s+1}, \ldots, x_t\} \setminus C_s$. Therefore, if nothing
happens and no page enters or leaves the cache, then $C_{s+1} = C_s$, $I_s = I_{s+1}$ and the
invariant is preserved for $s + 1$.

Now we deal with the case when there is some page movement at time $s + 1$, that is
$I_{s+1} > I_s$ or $O_{s+1} > O_s$ or both. We artificially decompose this page movement into as
many steps as needed, so that at each step there is only one operation happening: a fetch or an
eviction. The page movements at those intermediary steps are set so as to alternately evict and
enter pages, among the $O_{s+1} - O_s$ pages to evict and the $I_{s+1} - I_s$ pages to enter. Of
course if a fetch is pending at time $s$, that is $|C_s| = k - 1$, then we start with entering a new
page, and otherwise if the cache is full, i.e. $|C_s| = k$, we start with evicting a page. Since the
total numbers of entrances and evictions up to some time can differ by at most one, it is possible
to do so. Therefore, we need to do the induction case only in the case when a page is entering the
cache or when one is leaving the cache, but not both.

When a page is entering the cache, we have $I_{s+1} = I_s + 1$. Let $z$ be the page entering the
cache, and let $t_0 \ge s + 1$ be the next request time of $z$. Then if $t < t_0$, by the choice of
$z$, all requests of $x_{s+1}, \ldots, x_t$ must be in $C_{s+1}$, so the right hand side of (12) at
time $s + 1$ is 0, and the inequality holds by (7). Now if $t \ge t_0$, since $z \in C_{s+1}$ but
$z \notin C_s$, the right hand side of (12) at time $s + 1$ has decreased by 1 compared to time $s$,
but at the same time $I_{s+1} = I_s + 1$, so both sides of the invariant decrease by 1 and by
induction the inequality is preserved at time $s + 1$.

Now consider the case when a page leaves the cache. Let $y$ be the leaving page. Then
$I_s = I_{s+1}$ and $O_{s+1} = O_s + 1$. Let $t_0$ be the next request time of $y$, or let
$t_0 = n + 1$ if $y$ is never requested again. Then if $t < t_0$, removing $y$ from the cache does
not change the right hand side of (12) when replacing $s$ by $s + 1$. The left hand side does not
change either since no page enters the cache, and the inequality is preserved. For $t \ge t_0$
however, by the choice of the evicted page $y$, we have that
$C_{s+1} \subseteq \{x_{s+1}, \ldots, x_t\}$. So the right hand side of (12) at time $s + 1$ is
$|\{x_{s+1}, \ldots, x_t\}| - (k - 1)$, and $I_{s+1} = O_{s+1} - 1$ since we have just evicted a
page. Therefore, (12) holds by (11). □
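The first part of the proof, reading the fetch intervals off the counters and comparing their stall time with the objective value, can be sketched as follows. The function names are ours, the lists are indexed from 1 (index 0 unused), and the input is assumed to satisfy $O_n = I_n$ and constraints (7) to (10).

```python
def fetch_intervals(I, O):
    """Return [(s_1, e_1), ..., (s_m, e_m)] with m = O[n].

    s_j is the smallest time with O[s_j] >= j, e_j the smallest time with
    I[e_j] >= j. Assumption: O[n] == I[n], so every interval is closed.
    """
    n = len(O) - 1
    intervals = []
    for j in range(1, O[n] + 1):
        s = next(t for t in range(1, n + 1) if O[t] >= j)
        e = next(t for t in range(1, n + 1) if I[t] >= j)
        intervals.append((s, e))
    return intervals


def total_stall_time(intervals, F):
    """Sum of F - (e - s) over all fetch intervals."""
    return sum(F - (e - s) for s, e in intervals)


def objective_value(I, O, F):
    """F*O_n - F*I_1 - sum_t (O_t - I_t), the objective of the linear program."""
    n = len(O) - 1
    return F * O[n] - F * I[1] - sum(O[t] - I[t] for t in range(1, n + 1))
```

On the example `O = [None, 0, 1, 1, 1, 2, 2]`, `I = [None, 0, 0, 0, 1, 1, 2]` with $F = 3$, the intervals are $[2,4)$ and $[5,6)$, and both the total stall time and the objective value equal 3.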

Theorem 4 The offline prefetch problem can be solved in time $O(n^3)$ if $F = O(n)$ and in time
$O(n^3 \log n)$ otherwise.

Proof: First we observe that the $O(n^2)$ different right hand sides of (11) can be computed in time
$O(n^2 \log n)$ using dynamic programming and a search tree, so solving the linear program is the
bottleneck of the algorithm.
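A simple way to obtain all the counts $|\{x_s, \ldots, x_t\}|$ is to extend a set of seen pages for each start time $s$. The sketch below swaps the search tree mentioned above for a Python hash set (expected $O(n^2)$ time); the function name is ours.

```python
def distinct_counts(requests):
    """Return d with d[s][t] = |{x_s, ..., x_t}| for 1 <= s <= t <= n.

    requests: the sequence x_1..x_n as a Python sequence (0-based).
    Entries with s > t, and row/column 0, are left at 0.
    """
    n = len(requests)
    d = [[0] * (n + 1) for _ in range(n + 1)]
    for s in range(1, n + 1):
        seen = set()
        for t in range(s, n + 1):
            seen.add(requests[t - 1])
            d[s][t] = len(seen)
    return d
```

The right hand side of (11) for a pair $(s,t)$ is then `d[s][t] - k`.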

The dual of the linear program is a min cost flow problem with uncapacitated arcs, where the
supply/demand $b_i$ of the nodes $i$ are given by the coefficients in the cost function and where
the arc costs $c_{ij}$ are given by the right hand sides of the inequalities, see Figure 4.

[Figure not recoverable from extraction; its node labels show supplies/demands $-1$, $+F$, $-F$, $+1$ and an example arc cost $|\{x_2, \ldots, x_5\}| - k$.]

Figure 4: Min cost flow problem instance (not all arcs shown).
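
For reference, the generic primal-dual pair behind this correspondence can be written as follows. The potentials $\pi_i$, the flow variables $f_{ij}$, the arc set $A$ and the sign convention (positive $b_i$ denotes supply) are notation assumed here for illustration, and need not coincide with the convention used in the linear program above.

```latex
\begin{align*}
\text{(P)}\quad & \max \sum_{i} b_i \pi_i
  && \text{s.t. } \pi_i - \pi_j \le c_{ij} \quad \forall (i,j) \in A, \\
\text{(D)}\quad & \min \sum_{(i,j) \in A} c_{ij} f_{ij}
  && \text{s.t. } \sum_{j : (i,j) \in A} f_{ij} - \sum_{j : (j,i) \in A} f_{ji} = b_i \quad \forall i,
  \qquad f_{ij} \ge 0 .
\end{align*}
```

The variables $\pi_i$ are unrestricted in sign, which is why the constraints of (D) are equalities (flow conservation). Each constraint of (P) yields exactly one nonnegative flow variable with no upper bound, which is why the arcs are uncapacitated.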



This min cost flow problem can be solved in time $O(n^3 \log n)$ using [9]. To solve it in $O(n^3)$ when $F = O(n)$, we first
explode the source of supply $F - 1$ into $F - 1$ vertices of supply $+1$ and do the same with the sink of
demand $1 - F$. The new graph has only sources of supply $+1$ and sinks of demand $-1$. Clearly there
is a bijection between the min cost flows of the new and the original graph. Moreover a min cost
flow matches sources to sinks such that the flow between a matched source/sink pair uses a shortest
path (since the arcs have unbounded capacity) and such that the total distance is minimal. To
obtain this flow we first compute the distances in the graph between all source/sink pairs, in time
$O(n^3)$ using the Floyd-Warshall algorithm. Then we construct the bipartite source/sink graph,
where every edge is weighted with the source-sink distance in the original graph. Then a minimum
weight perfect matching can be computed in time $O(n^3)$, provided $F \in O(n)$, using Edmonds'
algorithm with adapted data structures. The optimal flow is then obtained by adding a unit flow
on the shortest path between source $i$ and sink $j$, for every edge of the matching corresponding to
source $i$ and sink $j$.
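
The three steps above (all-pairs distances, matching, routing) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation behind the stated bound: all names are hypothetical, the sources and sinks are assumed to be already exploded into unit supplies and demands, the graph is assumed to have no negative cycle, and every source is assumed to reach every sink. Since the matching graph is bipartite, the sketch substitutes the Hungarian method for Edmonds' algorithm.

```python
INF = float("inf")


def floyd_warshall(n, arcs):
    """All-pairs distances and next-hop table for nodes 0..n-1."""
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    nxt = [[None] * n for _ in range(n)]
    for u, v, c in arcs:
        if c < dist[u][v]:
            dist[u][v], nxt[u][v] = c, v
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    nxt[i][j] = nxt[i][k]
    return dist, nxt


def hungarian(cost):
    """Min-cost perfect matching on a square matrix; returns (row, col) pairs."""
    n = len(cost)
    u, v = [0] * (n + 1), [0] * (n + 1)     # dual potentials (1-based)
    p, way = [0] * (n + 1), [0] * (n + 1)   # p[j]: row matched to column j
    for i in range(1, n + 1):
        p[0], j0 = i, 0
        minv, used = [INF] * (n + 1), [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the augmenting path back to the artificial column 0
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, n + 1)]


def min_cost_unit_flow(n, arcs, sources, sinks):
    """Match unit sources to unit sinks and route one unit per matched pair."""
    dist, nxt = floyd_warshall(n, arcs)
    cost = [[dist[s][t] for t in sinks] for s in sources]
    total, flow = 0, {}
    for a, b in hungarian(cost):
        s, t = sources[a], sinks[b]
        total += dist[s][t]
        node = s
        while node != t:  # walk the shortest path, adding one unit per arc
            step = nxt[node][t]
            flow[(node, step)] = flow.get((node, step), 0) + 1
            node = step
    return total, flow


# Hypothetical instance: arcs are (tail, head, cost); costs may be negative.
example_arcs = [(0, 2, 1), (1, 2, 3), (2, 3, 1), (2, 4, -2), (0, 4, 5), (1, 3, 1)]
total, flow = min_cost_unit_flow(5, example_arcs, sources=[0, 1], sinks=[3, 4])
```

In this instance source 0 is matched to sink 4 through node 2, and source 1 is matched to sink 3 directly. Both the distance computation and the matching take cubic time in the number of nodes, which is where the condition that $F$ is linear in $n$ enters.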
Finally, to get an optimal solution for the primal linear program, we use the standard technique
of computing a shortest path tree in the residual graph obtained from the flow [1, chapter 9]. □
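
A minimal Python sketch of this last step is given below. It assumes that the optimal flow is available as a dictionary from arcs to flow values, and it runs Bellman-Ford from a virtual root joined to every node by a zero-cost arc, which is one common way to obtain shortest path distances in the residual graph. The function name, the input format and the sign convention ($\pi_u - \pi_v \le c_{uv}$) are assumptions of the sketch.

```python
def potentials_from_flow(n, arcs, flow):
    """Node potentials pi with pi[u] - pi[v] <= c on every arc (u, v, c),
    and equality on every arc that carries flow."""
    # Residual graph of an uncapacitated network: each arc keeps its forward
    # copy, and an arc carrying flow also gets a backward copy of cost -c.
    residual = list(arcs)
    residual += [(v, u, -c) for (u, v, c) in arcs if flow.get((u, v), 0) > 0]
    # Bellman-Ford from a virtual root with a zero-cost arc to every node.
    d = [0] * n
    for _ in range(n):
        changed = False
        for u, v, c in residual:
            if d[u] + c < d[v]:
                d[v] = d[u] + c
                changed = True
        if not changed:
            return [-x for x in d]  # potentials are negated distances
    raise ValueError("negative residual cycle: the given flow is not optimal")


# Hypothetical instance with an optimal flow assumed to be given.
example_arcs = [(0, 2, 1), (1, 2, 3), (2, 3, 1), (2, 4, -2), (0, 4, 5), (1, 3, 1)]
example_flow = {(1, 3): 1, (0, 2): 1, (2, 4): 1}
pi = potentials_from_flow(5, example_arcs, example_flow)
```

The returned potentials are feasible for the system of difference constraints and tight on all arcs with positive flow, so they satisfy complementary slackness with respect to the given flow.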



5 Conclusion 

Further work would include finding other optimization problems to which our technique
applies, and possibly deriving a general framework from it. We are also interested in improving the
combinatorial algorithms that arise from the graph structures in the scheduling problems: indeed
those graphs have, among others, the property that once the vertices are drawn as points on a line,
the arcs from left to right have positive weights and the ones from right to left have negative weights.
One idea, for instance, is to try to extract from Simons and Warmuth's algorithm a shortest path
algorithm suitable for our class of graphs.

We wish to thank Arthur Chargueraud, Philippe Baptiste, Miki Hermann and Leo Liberti for
helpful comments.



References 

[1] Ravindra K. Ahuja, Thomas L. Magnanti, and James B. Orlin. Network Flows: Theory,
Algorithms, and Applications. Prentice Hall, 1993.

[2] S. Albers and M. Büttner. Integrated prefetching and caching in single and parallel disk
systems. Information and Computation, 198:24-39, 2005.

[3] S. Albers, N. Garg, and S. Leonardi. Minimizing stall time in single and parallel disk systems. 
Journal of the ACM, 47:969-986, 2000. 

[4] Philippe Baptiste and Baruch Schieber. A note on scheduling tall/small multiprocessor tasks
with unit processing time to minimize maximum tardiness. Journal of Scheduling, 6(4):395-404, 2003.

[5] Peter Brucker. Scheduling Algorithms. Springer, 2001. 

[6] Peter Brucker and Svetlana Kravchenko. Scheduling jobs with equal processing times and
time windows on identical parallel machines. Journal of Scheduling, 11(4):229-237, 2008.

[7] P. Cao, E. W. Felten, A. R. Karlin, and K. Li. Implementation and performance of integrated
application-controlled caching, prefetching and disk scheduling. ACM Transactions on Computer
Systems, pages 188-196, 1995.

[8] Philip Hall. On representatives of subsets. J. London Math. Soc., 10:26-30, 1935.

[9] James B. Orlin. A faster strongly polynomial algorithm for the minimum cost flow problem. 
Operations Research, 41:338-350, 1993. 

[10] B. Simons. A fast algorithm for single processor scheduling. In Proceedings IEEE 19th Annual 
Symposium on Foundations of Computer Science (FOCS'78), pages 246-252, 1978. 

[11] Barbara Simons and Manfred Warmuth. A fast algorithm for multiprocessor scheduling of
unit-length jobs. SIAM Journal on Computing, 18(4):690-710, 1989.

[12] Jan van Leeuwen. Handbook of Theoretical Computer Science, volume A: Algorithms and
Complexity. Elsevier, 1990.






